🛡️ AI Safety
Alignment, Interpretability, Adversarial Examples, Ethics
Scoured 7839 posts in 12.7 ms
Evaluating Human-AI Safety: A Framework for Measuring Harmful Capability Uplift (🤝 Human-AI Collaboration) · arxiv.org · 2d
The Ethics of Artificial Intelligence (⚖️ AI Ethics) · hackettpublishing.com · 3d
The Ethics Theater of AI: Why Switching From ChatGPT to Claude Changes Less Than You Think (⚖️ AI Ethics) · hackernoon.com · 18h
Protecting people from harmful manipulation (🛡️ AI Security) · deepmind.google · 6d · Hacker News
When AI turns software development inside-out: 170% throughput at 80% headcount (⚡ Code Generation) · venturebeat.com · 4d
Detection of Adversarial Attacks in Robotic Perception (🛡️ AI Security) · arxiv.org · 2d
A Revealed Preference Framework for AI Alignment (👁️ Multimodal AI) · arxiv.org · 2d
Rethinking AI Literacy Education in Higher Education: Bridging Risk Perception and Responsible Adoption (⚖️ AI Ethics) · arxiv.org · 1d
A Provable Energy-Guided Test-Time Defense Boosting Adversarial Robustness of Large Vision-Language Models (🛡️ AI Security) · arxiv.org · 2d
AI Security in the Foundation Model Era: A Comprehensive Survey from a Unified Perspective (🛡️ AI Security) · arxiv.org · 6d
Lipschitz verification of neural networks through training (✓ Formal Verification) · arxiv.org · 2d
A Comparative Study in Surgical AI: Datasets, Foundation Models, and Barriers to Med-AGI (🤖 Agentic AI) · arxiv.org · 2d
A Unified Memory Perspective for Probabilistic Trustworthy AI (🛡️ AI Security) · arxiv.org · 6d
Position: Explainable AI is Causality in Disguise (⚖️ AI Ethics) · arxiv.org · 2d
Information-Theoretic Limits of Safety Verification for Self-Improving Systems (✓ Formal Verification) · arxiv.org · 2d
BeSafe-Bench: Unveiling Behavioral Safety Risks of Situated Agents in Functional Environments (🎯 AI Agents) · arxiv.org · 3d
Working Paper: Towards a Category-theoretic Comparative Framework for Artificial General Intelligence (🎭 Anthropic Claude) · arxiv.org · 1d
FairLLaVA: Fairness-Aware Parameter-Efficient Fine-Tuning for Large Vision-Language Assistants (👁️ Multimodal AI) · arxiv.org · 3d
Safeguarding LLMs Against Misuse and AI-Driven Malware Using Steganographic Canaries (🛡️ AI Security) · arxiv.org · 2d
Response-Aware Risk-Constrained Control Barrier Function With Application to Vehicles (🤝 Human-AI Collaboration) · arxiv.org · 6d